Using Infinispan as JPA-Hibernate Second Level Cache Provider

Overview

Following some of the principles set out by Brian Stansberry in Using JBoss Cache 3 as a Hibernate 3.5 Second Level Cache, and taking into account improvements introduced by Infinispan, an Infinispan JPA/Hibernate second level cache provider has been developed. This wiki explains how to configure JPA/Hibernate to use Infinispan and, for those keen on lower level details, the key design decisions made and the differences with previous JBoss Cache based cache providers.

If you're unfamiliar with JPA/Hibernate second level caching, read Chapter 2.1 in this guide, which explains the different types of data that can be cached.

Configuration

1. First, to enable the JPA/Hibernate second level cache with query result caching (note: query result caching, or for that matter entity caching, may not improve application performance, so be sure to benchmark your application with and without caching), add either of the following:

<!-- If using JPA, add to your persistence.xml -->
<property name="hibernate.cache.use_second_level_cache" value="true" />
<property name="hibernate.cache.use_query_cache" value="true" />

<!-- If using Hibernate, add to your hibernate.cfg.xml -->
<property name="hibernate.cache.use_second_level_cache">true</property>
<property name="hibernate.cache.use_query_cache">true</property>

2. Now, configure the Infinispan cache region factory using one of the two options below:

• If the Infinispan CacheManager instance happens to be bound to JNDI, select JndiInfinispanRegionFactory as the cache region factory and add the cache manager's JNDI name:

<!-- If using JPA, add to your persistence.xml -->
<property name="hibernate.cache.region.factory_class" value="org.hibernate.cache.infinispan.JndiInfinispanRegionFactory" />
<property name="hibernate.cache.infinispan.cachemanager" value="java:CacheManager" />

<!-- If using Hibernate, add to your hibernate.cfg.xml -->
<property name="hibernate.cache.region.factory_class">org.hibernate.cache.infinispan.JndiInfinispanRegionFactory</property>
<property name="hibernate.cache.infinispan.cachemanager">java:CacheManager/entity</property>

JBoss Application Server

JBoss Application Server 6 and 7 deploy a shared Infinispan cache manager that can be used by all services, so when configuring applications with the Infinispan second level cache, you should use the JNDI name of the cache manager responsible for the second level cache. By default, this is "java:CacheManager/entity". In any other application server, you can still deploy your own cache manager and bind it to JNDI, but in these cases it's generally preferred that the following method is used.

• If running JPA/Hibernate and Infinispan standalone or within a third-party application server, select InfinispanRegionFactory as the cache region factory:

<!-- If using JPA, add to your persistence.xml -->
<property name="hibernate.cache.region.factory_class" value="org.hibernate.cache.infinispan.InfinispanRegionFactory"/>

<!-- If using Hibernate, add to your hibernate.cfg.xml -->
<property name="hibernate.cache.region.factory_class">org.hibernate.cache.infinispan.InfinispanRegionFactory</property>
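
When the settings are assembled in code rather than in an XML file, the same three properties can be collected in a java.util.Properties object. The sketch below is only an illustration: the class and method names are hypothetical, and only the property names and values come from the snippets above.

```java
import java.util.Properties;

public class SecondLevelCacheSettings {

    // Collects the minimal standalone settings shown in the XML snippets above.
    public static Properties standaloneDefaults() {
        Properties props = new Properties();
        props.setProperty("hibernate.cache.use_second_level_cache", "true");
        props.setProperty("hibernate.cache.use_query_cache", "true");
        props.setProperty("hibernate.cache.region.factory_class",
                "org.hibernate.cache.infinispan.InfinispanRegionFactory");
        return props;
    }

    public static void main(String[] args) {
        // Print the settings; iteration order of Properties is not guaranteed.
        standaloneDefaults().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

A Properties object is also a Map, so it could be handed to Persistence.createEntityManagerFactory(String, Map) in a JPA application, or to Configuration.addProperties(Properties) in a plain Hibernate one.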

This is all the configuration you need to run JPA/Hibernate with the Infinispan cache provider using the default settings, which should suit the majority of use cases. When further configuration is required, see the following sections, where more advanced settings are discussed.

Default Configuration Explained

The aim of this section is to explain the default settings for each of the global data type caches (entity, collection, query and timestamps), why they were chosen and what the available alternatives are.

Defaults for Entity/Collection Caching

By default, whenever a new entity or collection is read from the database and needs to be cached, it is only cached locally in order to reduce intra-cluster traffic. This option cannot be changed.
By default, all entities and collections are configured to use synchronous invalidation as the clustering mode. This means that when an entity is updated, the updated cache sends a message to the other members of the cluster telling them that the entity has been modified. Upon receipt of this message, the other nodes remove this data from their local cache, if it is stored there. This option can be changed to use replication by configuring entities or collections to use the "replicated-entity" cache, but this is not recommended.
By default, all entities and collections have initial state transfer disabled since there's no need for it. Enabling it is not recommended.
By default, entities and collections are configured to use READ_COMMITTED as the cache isolation level. It would only make sense to configure REPEATABLE_READ if the application evicts/clears entities from the Hibernate Session and then expects to repeatably re-read them in the same transaction. Otherwise, the Session's internal cache provides repeatable-read semantics. If you really need to use REPEATABLE_READ, you can simply configure entities or collections to use the "entity-repeatable" cache.
By default, entities and collections are configured with the following eviction settings:
- Eviction wake up interval is 5 seconds.
- Maximum number of entries is 10,000.
- Maximum idle time before expiration is 100 seconds.

You can change these settings for all entities and collections, or for an individual entity or collection type. See the "Advanced Configuration" section below for more information.

Finally, by default, both entities and collections are configured with lazy deserialization, which helps deserialization when entities or collections are stored in isolated deployments. If you're sure you'll never deploy your entities or collections in classloader-isolated deployment archives, you can disable this setting.

Defaults for Query Caching

By default, the query cache is configured so that queries are only cached locally. Alternatively, you can configure query caching to use replication by selecting "replicated-query" as the query cache name. However, replication for the query cache only makes sense if all of these conditions are true:
- the query is quite expensive
- the query is very likely to be repeated on different cluster nodes
- the query is unlikely to be invalidated out of the cache (Note: Hibernate must aggressively invalidate query results from the cache any time any instance of one of the entity classes involved in the query's WHERE clause changes. All such query results are invalidated, even if the change made to the entity instance would not have affected the query result.)
By default, the query cache uses the same cache isolation level and eviction/expiration settings as entities/collections.
By default, the query cache has initial state transfer disabled. Enabling it is not recommended.

Defaults for Timestamps Cache

By default, the timestamps cache is configured with asynchronous replication as the clustering mode. Local or invalidated cluster modes are not allowed, since all cluster nodes must store all timestamps. As a result, no eviction/expiration is allowed for the timestamps cache either.
By default, the timestamps cache is configured with a cluster cache loader (in Hibernate 3.6.0 or earlier it had state transfer enabled) so that joining nodes can get all timestamps. You shouldn't attempt to disable the cluster cache loader for the timestamps cache.

JTA Transactions Configuration

It is highly recommended that Hibernate is configured with JTA transactions so that both Hibernate and Infinispan cooperate within the same transaction and the interaction works as expected.

Otherwise, if Hibernate is configured with, for example, JDBC transactions, Hibernate will create a Transaction instance via java.sql.Connection and Infinispan will create a transaction via whatever TransactionManager is returned by hibernate.transaction.manager_lookup_class. If hibernate.transaction.manager_lookup_class has not been set, it defaults to the dummy transaction manager. So, any work on the second level cache (2LC) will be done under a different transaction from the one used to commit the changes to the database via Hibernate. In other words, your operations on the database and the 2LC are not treated as a single unit. Risks here include failures to update the 2LC leaving it with stale data while the database committed data correctly. It has also been observed that under some circumstances where JTA was not used, commits/rollbacks are not propagated to Infinispan.

To sum up, if you configure Hibernate with Infinispan, apply the following changes to your configuration file:

1. Unless your application uses JPA, you need to select the correct Hibernate transaction factory via hibernate.transaction.factory_class property:

If you're running within an application server, it's recommended that you use:

<!-- If using JPA, add to your persistence.xml -->
<property name="hibernate.transaction.factory_class" value="org.hibernate.transaction.CMTTransactionFactory"/>

<!-- If using Hibernate, add to your hibernate.cfg.xml -->
<property name="hibernate.transaction.factory_class">org.hibernate.transaction.CMTTransactionFactory</property>

If you're running in a standalone environment and you want to enable the JTA transaction factory, use:

<!-- If using JPA, add to your persistence.xml -->
<property name="hibernate.transaction.factory_class" value="org.hibernate.transaction.JTATransactionFactory"/>

<!-- If using Hibernate, add to your hibernate.cfg.xml -->
<property name="hibernate.transaction.factory_class">org.hibernate.transaction.JTATransactionFactory</property>

JPA does not require a transaction factory class to be set up because the entity manager already sets it to a variant of CMTTransactionFactory.

2. Select the correct Hibernate transaction manager lookup:

If you're running within an application server, select the appropriate lookup class according to the "JTA Transaction Managers" table. For example, if you were running with JBoss Application Server, you'd select:

<!-- If using JPA, add to your persistence.xml -->
<property name="hibernate.transaction.manager_lookup_class" 
   value="org.hibernate.transaction.JBossTransactionManagerLookup"/>

<!-- If using Hibernate, add to your hibernate.cfg.xml -->
<property name="hibernate.transaction.manager_lookup_class">
   org.hibernate.transaction.JBossTransactionManagerLookup
</property>

If you're running standalone and you want to add a JTA transaction manager lookup, things get a bit more complicated. Due to a current limitation, Hibernate does not support injecting a JTA TransactionManager or JTA UserTransaction that is not bound to JNDI. In other words, if you want to use JTA, Hibernate expects your TransactionManager to be bound to JNDI and it also expects that UserTransaction instances are retrieved from JNDI. This means that in a standalone environment, you need to add some code that binds your TransactionManager and UserTransaction to JNDI. With this in mind and with the help of one of our community contributors, we've created an example that does just that: JBoss Standalone JTA Example. Once you have the code in place, it's just a matter of selecting the correct Hibernate transaction manager lookup class, based on the JNDI names given. Taking the JBoss Standalone JTA Example as a reference, you simply have to add:
```
<!-- If using JPA, add to your persistence.xml -->
<property name="hibernate.transaction.manager_lookup_class" 
   value="org.hibernate.transaction.JBossTransactionManagerLookup"/>
```

```
<!-- If using Hibernate, add to your hibernate.cfg.xml -->
<property name="hibernate.transaction.manager_lookup_class">
   org.hibernate.transaction.JBossTransactionManagerLookup
</property>
```
As you have probably noticed, this section never mentions the need to configure Infinispan's transaction manager lookup, and there's a good reason for that. The code within the Infinispan cache provider takes the transaction manager that has been configured at the Hibernate level and uses that. If no Hibernate transaction manager lookup class has been defined, Infinispan uses a default dummy transaction manager.

Since Hibernate 4.0, the way Infinispan hooks into the transaction manager can be configured. By default, since 4.0, Infinispan interacts with the transaction manager as a JTA synchronization, resulting in a faster interaction with the 2LC thanks to some key optimisations that the transaction manager can apply. However, if desired, users can configure Infinispan to act as an XA resource (just like it did in 3.6 and earlier) by disabling the use of the synchronization. For example:

<!-- If using JPA, add to your persistence.xml -->
<property name="hibernate.cache.infinispan.use_synchronization" value="false"/>

<!-- If using Hibernate, add to your hibernate.cfg.xml -->
<property name="hibernate.cache.infinispan.use_synchronization">
   false
</property>

Standalone JTA for JPA/Hibernate using Infinispan as 2LC

The JBoss standalone JTA example referred to in the previous section is inspired by the great work of Guenther Demetz, one of the members of the Infinispan community, who wrote an impressive wiki explaining how to set up standalone JTA with different transaction managers running outside of an EE server, and how to get this to work with an Infinispan backed JPA/Hibernate application.

Advanced Configuration

Infinispan can expose statistics via JMX and, since Hibernate 3.5.0.Beta4, you can enable such statistics from the Hibernate/JPA configuration file. By default, Infinispan statistics are turned off. Setting the following property enables statistics for the Infinispan Cache Manager and all the managed caches (entity, collection, etc.):

<!-- If using JPA, add to your persistence.xml -->
<property name="hibernate.cache.infinispan.statistics" value="true"/>

<!-- If using Hibernate, add to your hibernate.cfg.xml -->
<property name="hibernate.cache.infinispan.statistics">true</property>

The Infinispan cache provider jar file contains an Infinispan configuration file, which is the one used by default when configuring the Infinispan standalone cache region factory. This file contains default cache configurations for all Hibernate data types that should suit the majority of use cases. However, if you want to use a different configuration file, you can configure it via the following property:

<!-- If using JPA, add to your persistence.xml -->
<property name="hibernate.cache.infinispan.cfg" 
   value="/home/infinispan/cacheprovider-configs.xml"/>

<!-- If using Hibernate, add to your hibernate.cfg.xml -->
<property name="hibernate.cache.infinispan.cfg">
   /home/infinispan/cacheprovider-configs.xml
</property>

For each Hibernate cache data type, the Infinispan cache region factory defines a default cache name to look up in either the default or the user defined Infinispan cache configuration file. These default values can be found in the Infinispan cache provider javadoc. You can change these cache names using the following properties:

<!-- If using JPA, add to your persistence.xml -->
<property name="hibernate.cache.infinispan.entity.cfg" 
   value="custom-entity"/>
<property name="hibernate.cache.infinispan.collection.cfg" 
   value="custom-collection"/>
<property name="hibernate.cache.infinispan.query.cfg" 
   value="custom-collection"/>
<property name="hibernate.cache.infinispan.timestamp.cfg" 
   value="custom-timestamp"/>

<!-- If using Hibernate, add to your hibernate.cfg.xml -->
<property name="hibernate.cache.infinispan.entity.cfg">
   custom-entity
</property>
<property name="hibernate.cache.infinispan.collection.cfg">
   custom-collection
</property>
<property name="hibernate.cache.infinispan.query.cfg">
   custom-collection
</property>
<property name="hibernate.cache.infinispan.timestamp.cfg">
   custom-timestamp
</property>

One of the key improvements brought in by Infinispan is that cache instances are more lightweight than they used to be in JBoss Cache. This has enabled a radical change in the way entity/collection type cache management works. With the Infinispan cache provider, each entity/collection type gets its own cache instance, whereas in the old JBoss Cache based cache providers, all entity/collection types shared the same cache instance. As a result, locking issues related to updating different entity/collection types concurrently are avoided completely.

This also has an important bearing on the meaning of hibernate.cache.infinispan.entity.cfg and hibernate.cache.infinispan.collection.cfg properties. These properties define the template cache name that should be used for all entity/collection data types. So, with the above hibernate.cache.infinispan.entity.cfg configuration, when a region needs to be created for entity com.acme.Person, the cache instance to be assigned to this entity will be based on a "custom-entity" named cache.

On top of that, this finer grained cache definition enables users to define cache settings on a per entity/collection basis. For example:

<!-- If using JPA, add to your persistence.xml -->
<property name="hibernate.cache.infinispan.com.acme.Person.cfg" 
   value="person-entity"/>
<property name="hibernate.cache.infinispan.com.acme.Person.addresses.cfg" 
   value="addresses-collection"/>

<!-- If using Hibernate, add to your hibernate.cfg.xml -->
<property name="hibernate.cache.infinispan.com.acme.Person.cfg">
   person-entity
</property>
<property name="hibernate.cache.infinispan.com.acme.Person.addresses.cfg">
   addresses-collection
</property>

Here, we're configuring the Infinispan cache provider so that for the com.acme.Person entity type, the cache instance assigned will be based on a "person-entity" named cache, and for the com.acme.Person.addresses collection type, the cache instance assigned will be based on an "addresses-collection" named cache. If either of these two named caches did not exist in the Infinispan cache configuration file, the cache provider would create a cache instance for the com.acme.Person entity and the com.acme.Person.addresses collection based on the default cache in the configuration file.
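
The order in which these properties apply to an entity type can be sketched as a small lookup: a type-specific .cfg property wins over the generic hibernate.cache.infinispan.entity.cfg template, which in turn wins over the built-in default name. This is an illustration of the precedence described above, not the provider's actual code. The class and method names are hypothetical, and the fallback name "entity" is an assumption about the default entity cache name.

```java
import java.util.Properties;

public class CacheNameResolver {

    private static final String PREFIX = "hibernate.cache.infinispan.";

    // Assumption: the provider's built-in default entity cache name.
    private static final String ASSUMED_DEFAULT_ENTITY_CACHE = "entity";

    // Returns the named cache that the region for the given entity type would be based on.
    public static String entityCacheName(Properties props, String entityTypeName) {
        // 1. A type-specific override, e.g. hibernate.cache.infinispan.com.acme.Person.cfg
        String specific = props.getProperty(PREFIX + entityTypeName + ".cfg");
        if (specific != null) {
            // Values written on their own line in hibernate.cfg.xml carry surrounding whitespace.
            return specific.trim();
        }
        // 2. The template shared by all entity types.
        String template = props.getProperty(PREFIX + "entity.cfg");
        if (template != null) {
            return template.trim();
        }
        // 3. Nothing configured: fall back to the assumed default name.
        return ASSUMED_DEFAULT_ENTITY_CACHE;
    }
}
```

With the properties from the examples above, com.acme.Person would resolve to "person-entity", while any other entity type would resolve to "custom-entity". Whether a cache with that name actually exists in the Infinispan configuration file is a separate step that this sketch does not cover.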

Furthermore, thanks to the excellent feedback from the Infinispan community and in particular, Brian Stansberry, we've decided to allow users to define the most commonly tweaked Infinispan cache parameters via hibernate.cfg.xml or persistence.xml, for example eviction/expiration settings. So, with the Infinispan cache provider, you can configure eviction/expiration this way:

<!-- If using JPA, add to your persistence.xml -->
<property name="hibernate.cache.infinispan.entity.eviction.strategy" 
   value= "LRU"/>
<property name="hibernate.cache.infinispan.entity.eviction.wake_up_interval" 
   value= "2000"/>
<property name="hibernate.cache.infinispan.entity.eviction.max_entries" 
   value= "5000"/>
<property name="hibernate.cache.infinispan.entity.expiration.lifespan" 
   value= "60000"/>
<property name="hibernate.cache.infinispan.entity.expiration.max_idle" 
   value= "30000"/>

<!-- If using Hibernate, add to your hibernate.cfg.xml -->
<property name="hibernate.cache.infinispan.entity.eviction.strategy">
   LRU
</property>
<property name="hibernate.cache.infinispan.entity.eviction.wake_up_interval">
   2000
</property>
<property name="hibernate.cache.infinispan.entity.eviction.max_entries">
   5000
</property>
<property name="hibernate.cache.infinispan.entity.expiration.lifespan">
   60000
</property>
<property name="hibernate.cache.infinispan.entity.expiration.max_idle">
   30000
</property>

With the above configuration, you're overriding whatever eviction/expiration settings were defined for the default entity cache name in the Infinispan cache configuration used, regardless of whether it's the default one or user defined. More specifically, we're defining the following:

- All entities to use the LRU eviction strategy
- The eviction thread to wake up every 2000 milliseconds
- The maximum number of entities for each entity type to be 5000 entries
- The lifespan of each entity instance to be 60000 milliseconds
- The maximum idle time for each entity instance to be 30000 milliseconds

You can also override eviction/expiration settings on a per entity/collection type basis, so that the overridden settings only affect that particular entity type (e.g. com.acme.Person) or collection type (e.g. com.acme.Person.addresses). For example:

<!-- If using JPA, add to your persistence.xml -->
<property name="hibernate.cache.infinispan.com.acme.Person.eviction.strategy" 
   value= "FIFO"/>
<property name="hibernate.cache.infinispan.com.acme.Person.eviction.wake_up_interval" 
   value= "2500"/>
<property name="hibernate.cache.infinispan.com.acme.Person.eviction.max_entries" 
   value= "5500"/>
<property name="hibernate.cache.infinispan.com.acme.Person.expiration.lifespan" 
   value= "65000"/>
<property name="hibernate.cache.infinispan.com.acme.Person.expiration.max_idle" 
   value= "35000"/>

<!-- If using Hibernate, add to your hibernate.cfg.xml -->
<property name="hibernate.cache.infinispan.com.acme.Person.eviction.strategy">
   FIFO
</property>
<property name="hibernate.cache.infinispan.com.acme.Person.eviction.wake_up_interval">
   2500
</property>
<property name="hibernate.cache.infinispan.com.acme.Person.eviction.max_entries">
   5500
</property>
<property name="hibernate.cache.infinispan.com.acme.Person.expiration.lifespan">
   65000
</property>
<property name="hibernate.cache.infinispan.com.acme.Person.expiration.max_idle">
   35000
</property>
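
The generic and per-type overrides follow the same key pattern, with only the scope segment changing ("entity" versus a fully qualified type name such as com.acme.Person). The hypothetical helper below is a minimal sketch of that pattern. It takes durations in whole seconds and writes them as the millisecond values used in the examples above. The class and method names are not part of Hibernate or Infinispan.

```java
import java.util.Properties;
import java.util.concurrent.TimeUnit;

public class EvictionOverrides {

    private static final String PREFIX = "hibernate.cache.infinispan.";

    // Builds the five eviction/expiration override properties for one scope,
    // where scope is "entity" or a type name such as "com.acme.Person".
    public static Properties forScope(String scope, String strategy,
                                      long wakeUpSeconds, int maxEntries,
                                      long lifespanSeconds, long maxIdleSeconds) {
        String base = PREFIX + scope;
        Properties props = new Properties();
        props.setProperty(base + ".eviction.strategy", strategy);
        props.setProperty(base + ".eviction.wake_up_interval", millis(wakeUpSeconds));
        props.setProperty(base + ".eviction.max_entries", Integer.toString(maxEntries));
        props.setProperty(base + ".expiration.lifespan", millis(lifespanSeconds));
        props.setProperty(base + ".expiration.max_idle", millis(maxIdleSeconds));
        return props;
    }

    // The property values are expressed in milliseconds.
    private static String millis(long seconds) {
        return Long.toString(TimeUnit.SECONDS.toMillis(seconds));
    }
}
```

For instance, forScope("entity", "LRU", 2, 5000, 60, 30) produces the same five settings as the generic entity example shown earlier.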

The aim of these configuration capabilities is to reduce the number of files that need to be modified in order to define the most commonly tweaked parameters. So, by enabling eviction/expiration configuration per generic Hibernate data type or per particular entity/collection type via hibernate.cfg.xml or persistence.xml, users don't have to touch Infinispan's cache configuration file any more. We believe users will like this approach, so if there are any other Infinispan parameters that you often tweak and that cannot be configured via hibernate.cfg.xml or persistence.xml, please let the Infinispan team know by sending an email to infinispan-dev@lists.jboss.org

Please note that query/timestamp caches work the same way they did with JBoss Cache based cache providers. In other words, there's a query cache instance and timestamp cache instance shared by all. It's worth noting that eviction/expiration settings are allowed for query cache but not for timestamp cache. So configuring an eviction strategy other than NONE for timestamp cache would result in a failure to start up.

Finally, from Hibernate 3.5.4 and 3.6 onwards, queries with specific cache region names are stored under matching cache instances. This means that you can now set query cache region specific settings. For example, assuming you had a query like this:

Query query = sessionFactory.getCurrentSession().createQuery(
  "select account.branch from Account as account where account.holder = ?");
query.setCacheable(true);
query.setCacheRegion("AccountRegion");

The query would be stored under the "AccountRegion" cache instance, and users could control its settings in a similar fashion to what was done with entities and collections. So, for example, you could define specific eviction settings for this particular query region like this:

<!-- If using JPA, add to your persistence.xml -->
<property name="hibernate.cache.infinispan.AccountRegion.eviction.strategy" 
   value= "FIFO"/>
<property name="hibernate.cache.infinispan.AccountRegion.eviction.wake_up_interval" 
   value= "10000"/>

<!-- If using Hibernate, add to your hibernate.cfg.xml -->
<property name="hibernate.cache.infinispan.AccountRegion.eviction.strategy">
   FIFO
</property>
<property name="hibernate.cache.infinispan.AccountRegion.eviction.wake_up_interval">
   10000
</property>
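
The query example above uses the Hibernate Session API. In a JPA application, the same cacheable flag and region name are usually passed as query hints. The sketch below only builds the hint map, so it runs without any JPA dependency. The class and method names are hypothetical, and the hint names "org.hibernate.cacheable" and "org.hibernate.cacheRegion" are assumed to be the ones understood by the Hibernate version in use.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class QueryRegionHints {

    // Builds the hints that mark a JPA query as cacheable in a named region.
    public static Map<String, Object> cacheHints(String regionName) {
        Map<String, Object> hints = new LinkedHashMap<>();
        hints.put("org.hibernate.cacheable", Boolean.TRUE);   // assumed Hibernate hint name
        hints.put("org.hibernate.cacheRegion", regionName);   // assumed Hibernate hint name
        return hints;
    }

    public static void main(String[] args) {
        // Prints the hints that would be applied for the "AccountRegion" region.
        cacheHints("AccountRegion").forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Each entry would then be applied with javax.persistence.Query.setHint(name, value) before the query is executed.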

Integration with JBoss Application Server

In JBoss Application Server 7, Infinispan is the default second level cache provider, and you can find details about its configuration in the AS7 JPA reference guide.

Infinispan based Hibernate 2LC was developed as part of Hibernate 3.5 release and so it currently only works within AS 6 or higher. This is due to limitations within Hibernate 3.5 which is not designed to work with AS/EAP 5.x or lower. To be able to run Infinispan based Hibernate 2LC in a lower AS version such as 5.1, the Infinispan 2LC module would require porting to Hibernate 3.3.

Recently, William Decoste has helped migrate the Infinispan 2LC module to Hibernate 3.3, and in this wiki, he explains how to integrate Infinispan Hibernate 2LC with JBoss AS/EAP 5.x.

Using Infinispan as remote Second Level Cache?

Lately, several questions have appeared in the Infinispan user forums asking whether it is possible to have an Infinispan second level cache that, instead of living in the same JVM as the Hibernate code, resides in a remote server, e.g. an Infinispan Hot Rod server. It's important to understand that setting up the second level cache in this way is generally not a good idea, for the following reasons:

• The purpose of a JPA/Hibernate second level cache is to store entities/collections recently retrieved from the database or to maintain the results of recent queries. So, part of the aim of the second level cache is to have data accessible locally rather than having to go to the database to retrieve it every time it is needed. Hence, if you decide to make the second level cache remote as well, you're losing one of its key advantages: the fact that the cache is local to the code that requires it.
• Setting up a remote second level cache can have a negative impact on the overall performance of your application, because cache misses require accessing a remote location to verify whether a particular entity/collection/query is cached. With a local second level cache, however, these misses are resolved locally and so they're much faster to execute than with a remote second level cache.

There are, however, some edge cases where it might make sense to have a remote second level cache, for example:

• You are having memory issues in the JVM where the JPA/Hibernate code and the second level cache are running. Offloading the second level cache to remote Hot Rod servers could be an interesting way to separate systems and allow you to find the culprit of the memory issues more easily.
• Your application layer cannot be clustered but you still want to run multiple application layer nodes. In this case, you can't have multiple local second level cache instances running because they won't be able to invalidate each other, for example when data in the second level cache is updated. Here, a remote second level cache could be a way to make sure your second level cache is always in a consistent state, with all nodes in the application layer pointing to it.
• Rather than having the second level cache in a remote server, you simply want to keep the cache in a separate VM on the same machine. In this case you would still have the additional overhead of talking to another JVM, but you would avoid the latency of going across a network. The benefits of doing this are that you can:
- Size the cache separately from the application, since the cache and the application server have very different memory profiles. One has lots of short lived objects, and the other could have lots of long lived objects.
- Pin the cache and the application server onto different CPU cores (using numactl), and even pin them to different physical memory based on the NUMA nodes.

Frequently Asked Questions

To find out more please go to the Hibernate 2nd level cache section in the Technical FAQ wiki.

